1.  Introduction


Under most recent linguistic theories, linguistic constraints fall into several subsystems,
each having its own character. Chomsky (1981:5), for instance, identifies the subtheories
of bounding, government, θ-marking, binding, Case, and control, while Shieber (1983:2ff)
describes a version of Gazdar and Pullum’s GPSG formalism that involves
immediate-dominance rules, linear-order constraints, and metarules. When several
independent constraints are involved, a rule system that explicitly multiplies out their
effects is large, cumbersome, and uninformative.¹ For example, as Shieber (:4) points out,
the expanded context-free “object grammar” derived by multiplying out the constraints in
a typical GPSG system would contain trillions of rules.

Given the disadvantages of multiplying out the effects of separate systems of
constraints, Shieber’s (1983) work leads in a welcome direction. Shieber considers how one
might do parsing with ID/LP grammars, which involve two orthogonal kinds of rules. ID
rules constrain immediate dominance irrespective of constituent order (“a sentence can be
composed of V with NP and SBAR complements”), while LP rules constrain linear
precedence among the daughters of any node (“if V and SBAR are sisters, then V must precede
SBAR”). Shieber shows how Earley’s (1970) algorithm for parsing context-free grammars
(CFGs) can be adapted to use the constraints of ID/LP grammars directly, without the
combinatorially explosive step of converting the ID/LP grammar into standard
context-free form. Instead of multiplying out all of the possible surface interactions among the
ID and LP rules, Shieber’s algorithm applies them one step at a time as needed. Surely
this should work better in a parsing application than applying Earley’s algorithm to an
expanded grammar with trillions of rules, since the worst-case time complexity of Earley’s
algorithm is proportional to the square of the grammar size!

Shieber’s general approach is on the right track. On pain of having a large and
cumbersome rule system, the parser designer should first look to linguistics to find the correct
set of constraints on syntactic structure, then discover how to apply some form of those
constraints in parsing without multiplying out all possible surface manifestations of their
effects.

Nonetheless, nagging doubts about computational complexity remain. Although
Shieber (1983:15) claims that his algorithm is identical to Earley’s in time complexity,
it seems almost too much to hope for that the size of an ID/LP grammar should enter into
the time complexity of ID/LP parsing in exactly the same way that the size of a CFG enters
into the time complexity of CFG parsing. An ID/LP grammar G can enjoy a huge size
advantage over a context-free grammar G′ for the same language; for example, if G contains
only the rule S →_ID abcde, the corresponding G′ contains 5! = 120 rules. In effect, the
claim that Shieber’s algorithm has the same time complexity as Earley’s algorithm means
that this tremendously increased brevity of expression comes free (up to a constant). The
paucity of supporting argument in Shieber’s article does little to allay these doubts:

We will not present a rigorous demonstration of time complexity, but it
should be clear from the close relation between the presented algorithm
and Earley’s that the complexity is that of Earley’s algorithm. In the
worst case, where the LP rules always specify a unique ordering for the
right-hand side of every ID rule, the presented algorithm reduces to
Earley’s algorithm. Since, given the grammar, checking the LP rules takes
constant time, the time complexity of the presented algorithm is
identical to Earley’s .... That is, it is $O(|G|^2 \cdot n^3)$, where |G| is the size of the
grammar (number of ID rules) and n is the length of the input. (:14f)

¹ See Barton (1984) for discussion.

Many questions remain; for example, why should a situation of maximal constraint represent
the worst case, as Shieber claims?²

The following sections will investigate the complexity of ID/LP parsing in more detail.
In brief, the outcome is that Shieber’s direct-parsing algorithm usually does have a time
advantage over the use of Earley’s algorithm on the expanded CFG, but that it blows up in
the worst case. The claim of $O(|G|^2 \cdot n^3)$ time complexity is mistaken; in fact, the worst-case
time complexity of ID/LP parsing cannot be bounded by any polynomial in the size of the
grammar and input, unless P = NP. ID/LP parsing is NP-complete.

As it turns out, the complexity of ID/LP parsing has its source in the
immediate-domination rules rather than the linear precedence constraints. Consequently, the
precedence constraints will be neglected. Attention will be focused on unordered context-free
grammars (UCFGs), which are exactly like standard context-free grammars except that
when a rule is used in a derivation, the symbols on its right-hand side are considered to
be unordered and hence may be written in any order. UCFGs represent the special case of
ID/LP grammars in which there are no LP constraints. Shieber’s ID/LP algorithm can be
used to parse UCFGs simply by ignoring all references to LP constraints.

2.  Generalizing  Earley’s  algorithm 

Shieber generalizes Earley’s algorithm by modifying the progress datum that tracks
progress through a rule. The Earley algorithm uses the position of a dot to track
linear advancement through an ordered sequence of constituents. The major predicates and
operations on such dotted rules are these:

•  A dotted rule is initialized with the dot at the left edge, as in X → .ABC.

•  A dotted rule is advanced across a terminal or nonterminal that was predicted and
has been located in the input by simply moving the dot to the right. For example,
X → A.BC is advanced across a B by moving the dot to obtain X → AB.C.

•  A dotted rule is complete iff the dot is at the right edge. For example, X → ABC.
is complete.

•  A dotted rule predicts a terminal or nonterminal iff the dot is immediately before
the terminal or nonterminal. For example, X → A.BC predicts B.

² See section 5; it is in fact the best case.

UCFG rules differ from CFG rules only in that the right-hand sides represent unordered
multisets (that is, sets with repeated elements allowed). It is thus appropriate to use
successive accumulation of set elements in place of linear advancement through a sequence. In
essence, Shieber’s algorithm replaces the standard operations on dotted rules with
corresponding operations on what will be called dotted UCFG rules:³

•  A dotted UCFG rule is initialized with the empty multiset before the dot and the
entire multiset of right-hand elements after the dot, as in X → {}.{A,B,C}.

•  A dotted UCFG rule is advanced across a terminal or nonterminal that was
predicted and has been located in the input by simply moving one element from the
multiset after the dot to the multiset before the dot. For example, X → {A}.{B,C}
is advanced across a B by moving the B to obtain X → {A,B}.{C}. Similarly,
X → {A}.{B,C,C} may be advanced across a C to obtain X → {A,C}.{B,C}.

•  A dotted UCFG rule is complete iff the multiset after the dot is empty. For example,
X → {A,B,C}.{} is complete.

•  A dotted UCFG rule predicts a terminal or nonterminal iff the terminal or
nonterminal is a member of the multiset after the dot. For example, X → {A}.{B,C}
predicts B and C.

Given  these  replacements  for  operations  on  dotted  rules,  Shieber’s  algorithm  operates  in 
the  same  way  as  Earley’s  algorithm.  As  usual,  each  state  in  the  parser’s  state  sets  consists 
of  a  dotted  rule  tracking  progress  through  a  constituent  plus  the  interword  position  defining 
the  constituent’s  left  edge  (Earley,  1970:95,  omitting  lookahead).  The  left-edge  position  is 
also  referred  to  as  the  return  pointer  because  of  its  role  in  the  complete  operation  of  the 
parser. 
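
The four operations on dotted UCFG rules can be sketched in a few lines of Python. This is only an illustration of the multiset bookkeeping described above, not Shieber’s implementation; the class name `DottedUCFGRule`, its method names, and the helper `_freeze` are assumptions introduced here, and the return pointer and LP checks are left out.

```python
from collections import Counter
from dataclasses import dataclass


def _freeze(multiset):
    """Canonical hashable form of a multiset: a sorted tuple of its elements."""
    return tuple(sorted(multiset.elements()))


@dataclass(frozen=True)
class DottedUCFGRule:
    """Hypothetical representation of a dotted UCFG rule X -> {before}.{after}."""
    lhs: str
    before: tuple  # multiset of symbols already located in the input
    after: tuple   # multiset of symbols still to be located

    @classmethod
    def initialize(cls, lhs, rhs):
        # Empty multiset before the dot, entire right-hand side after it.
        return cls(lhs, (), tuple(sorted(rhs)))

    def predicts(self):
        # Every distinct member of the multiset after the dot is predicted.
        return set(self.after)

    def is_complete(self):
        # Complete iff nothing remains after the dot.
        return not self.after

    def advance(self, symbol):
        # Move one occurrence of `symbol` from after the dot to before it.
        remaining = Counter(self.after)
        if remaining[symbol] == 0:
            raise ValueError(f"{symbol!r} is not predicted by this rule")
        remaining[symbol] -= 1
        found = Counter(self.before)
        found[symbol] += 1
        return DottedUCFGRule(self.lhs, _freeze(found), _freeze(remaining))
```

Because both multisets are stored in sorted order, two states that have accumulated the same elements in different orders compare equal, which is what lets a single state stand for many orderings.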


3.  The  advantages  of  Shieber’s  algorithm 

The first question to ask is whether Shieber’s algorithm saves anything. Is it faster to
use Shieber’s algorithm on a UCFG than to use Earley’s algorithm on the corresponding
expanded CFG? Consider the UCFG G1 that has only the single rule S → abcde. The
corresponding CFG G1′ has 120 rules spelling out all the permutations of abcde: S → abcde,
S → abced, and so forth. If the string abcde is parsed using Shieber’s algorithm directly on
G1, the state sets of the parser remain small:⁴

S0:  [S → {}.{a,b,c,d,e}, 0]
S1:  [S → {a}.{b,c,d,e}, 0]
S2:  [S → {a,b}.{c,d,e}, 0]
S3:  [S → {a,b,c}.{d,e}, 0]
S4:  [S → {a,b,c,d}.{e}, 0]
S5:  [S → {a,b,c,d,e}.{}, 0]

In contrast, consider what happens if the same string is parsed using Earley’s algorithm on
the expanded CFG with its 120 rules. As Figure 1 illustrates, the state sets of the Earley
parser are much larger. In state set S1, the Earley parser uses 4! = 24 states to spell out
all the possible orders in which the remaining symbols {b,c,d,e} could appear. Shieber’s
modified parser does not spell them out, but uses the single state [S → {a}.{b,c,d,e}, 0] to
summarize them all. Shieber’s algorithm should thus be faster, since both parsers work by
successively processing all of the states in the state sets.

(a)  [S → {a}.{b,c,d,e}, 0]

(b)  [S → a.edcb, 0]    [S → a.ecbd, 0]
     [S → a.decb, 0]    [S → a.cebd, 0]
     [S → a.ecdb, 0]    [S → a.ebcd, 0]
     [S → a.cedb, 0]    [S → a.becd, 0]
     [S → a.dceb, 0]    [S → a.cbed, 0]
     [S → a.cdeb, 0]    [S → a.bced, 0]
     [S → a.edbc, 0]    [S → a.dcbe, 0]
     [S → a.debc, 0]    [S → a.cdbe, 0]
     [S → a.ebdc, 0]    [S → a.dbce, 0]
     [S → a.bedc, 0]    [S → a.bdce, 0]
     [S → a.dbec, 0]    [S → a.cbde, 0]
     [S → a.bdec, 0]    [S → a.bcde, 0]

Figure 1: The use of the Shieber parser on a UCFG can enjoy a large advantage over the
use of the Earley parser on the corresponding expanded CFG. After having processed the
terminal a while parsing the string abcde as discussed in the text, the Shieber parser uses
the single state shown in (a) to keep track of the same information for which the Earley
parser uses the 24 states in (b).

³ Shieber’s representation differs in some ways from the representation used here, which was developed
independently by the author. The differences are generally inessential, but see note 5.

⁴ The states related to the auxiliary start symbol and endmarker that are added by some versions of the
Earley parser have been omitted for simplicity.

Similar examples show that the Shieber parser can enjoy an arbitrarily large advantage
over the use of the Earley parser on the expanded CFG. Instead of multiplying out all surface
appearances ahead of time to produce an expanded CFG, Shieber’s algorithm works out
the possibilities one step at a time, as needed. This can be an advantage because not all of
the possibilities may arise with a particular input.
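
The size comparison in this section can be checked with a short Python sketch. It is not a parser: it only enumerates the progress states that each representation needs for the single rule S → abcde after a given prefix has been scanned. The function names are hypothetical and introduced only for this illustration.

```python
from itertools import permutations


def earley_states_after_prefix(symbols, prefix):
    """Dotted rules of the expanded CFG that are consistent with `prefix`.

    Each permutation of `symbols` is one rule of the expanded grammar; a rule
    survives if its first len(prefix) symbols spell out the prefix exactly.
    """
    n = len(prefix)
    return {
        (perm[:n], perm[n:])
        for perm in permutations(symbols)
        if perm[:n] == tuple(prefix)
    }


def shieber_states_after_prefix(symbols, prefix):
    """The single dotted UCFG rule that summarizes the same information."""
    remaining = list(symbols)
    for symbol in prefix:
        remaining.remove(symbol)  # raises ValueError if the prefix is impossible
    return {(tuple(sorted(prefix)), tuple(sorted(remaining)))}


if __name__ == "__main__":
    print(len(earley_states_after_prefix("abcde", "a")))   # 24
    print(len(shieber_states_after_prefix("abcde", "a")))  # 1
```

With unambiguous terminals the Shieber-style count stays at one state per position, while the expanded-grammar count is the factorial of the number of symbols still outstanding.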


4.  Combinatorial  explosion  with  Shieber’s  algorithm 


The answer to the first question is yes, then: it can be more efficient to use Shieber’s
parser than to use the Earley parser on an expanded “object grammar.” The second question
to ask is whether Shieber’s parser always enjoys a large advantage. Does the algorithm blow
up in difficult cases?

In the presence of lexical ambiguity, Shieber’s algorithm can suffer from combinatorial
explosion. Consider the following UCFG, G2, in which x is five-ways ambiguous:

S → ABCDE
A → a | x
B → b | x
C → c | x
D → d | x
E → e | x

What happens if Shieber’s algorithm is used to parse the string xxxxa according to this
grammar? After the first three occurrences of x have been processed, the state set of
Shieber’s parser will reflect the possibility that any three of the phrases A, B, C, D, and E
might have been encountered in the input and any two of them might remain to be parsed.
There will be $\binom{5}{3} = 10$ states reflecting progress through the rule expanding S, in addition to
5 states reflecting phrase completion and 10 states reflecting phrase prediction (not shown):

S3:  [S → {A,B,C}.{D,E}, 0]    [S → {A,B,D}.{C,E}, 0]
     [S → {A,C,D}.{B,E}, 0]    [S → {B,C,D}.{A,E}, 0]
     [S → {A,B,E}.{C,D}, 0]    [S → {A,C,E}.{B,D}, 0]
     [S → {B,C,E}.{A,D}, 0]    [S → {A,D,E}.{B,C}, 0]
     [S → {B,D,E}.{A,C}, 0]    [S → {C,D,E}.{A,B}, 0]

In cases like this, Shieber’s algorithm enumerates all of the combinations of k elements taken
i at a time, where k is the rule length and i is the number of elements already processed.
Thus it can be combinatorially explosive.

It is important to note that even in this case, Shieber’s algorithm wins out over parsing
the expanded CFG with Earley’s algorithm. After the same input symbols have been
processed, the state set of the Earley parser will reflect the same possibilities as the state
set of the Shieber parser: any three of the required phrases might have been located, while
any two of them might remain to be parsed. However, the Earley parser has a less concise
representation to work with. In place of the state involving S → {A,B,C}.{D,E}, for
instance, there will be 3! · 2! = 12 states involving S → ABC.DE, S → BCA.ED, and so
forth.⁵ Instead of a total of 25 states, the Earley state set will contain 135 = 12 · 10 + 15
states.
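
The state counts quoted above for G2 can be reproduced with a few lines of Python. The sketch below only counts progress states through the rule S → ABCDE; the 15 prediction and completion states are taken from the text as a constant rather than derived, and all names are hypothetical.

```python
from itertools import combinations
from math import factorial

PHRASES = "ABCDE"        # right-hand side of the rule S -> ABCDE
OTHER_STATES = 15        # 5 completion + 10 prediction states, as stated in the text


def shieber_progress_states(found):
    """One dotted UCFG rule for each choice of `found` phrases already located."""
    everything = frozenset(PHRASES)
    return [
        (frozenset(chosen), everything - frozenset(chosen))
        for chosen in combinations(PHRASES, found)
    ]


def earley_progress_state_count(found):
    """Each Shieber state expands into found! * (k - found)! ordered dotted rules."""
    k = len(PHRASES)
    orderings = factorial(found) * factorial(k - found)
    return len(shieber_progress_states(found)) * orderings


if __name__ == "__main__":
    shieber_total = len(shieber_progress_states(3)) + OTHER_STATES
    earley_total = earley_progress_state_count(3) + OTHER_STATES
    print(shieber_total, earley_total)  # 25 135
```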

In the above case, although the parser could not be sure of the categorial identities of
the phrases parsed, at least there was no uncertainty about the number of phrases and their
extent. We can make matters even worse for the parser by introducing uncertainty in those
areas as well. Let G3 be the result of replacing every x in G2 with the empty string ε:

S → ABCDE
A → a | ε
B → b | ε
C → c | ε
D → d | ε
E → e | ε

⁵ In contrast to the representation illustrated here, Shieber’s representation actually suffers to some extent
from the same problem. Shieber (1983:10) uses an ordered sequence instead of a multiset before the dot;
consequently, in place of the state involving S → {A,B,C}.{D,E}, Shieber would have the 3! = 6 states
involving S → α.{D,E}, where α ranges over the six permutations of ABC.

Then an A, for instance, can be either an a or nothing. Before any input has been read,
the first state set S0 in Shieber’s parser must reflect the possibility that the correct parse
may include any of the $2^5 = 32$ possible subsets of A, B, C, D, and E as empty initial
constituents. For example, S0 must include [S → {A,B,C,D,E}.{}, 0] because the input
might turn out to be the null string. Similarly, it must include [S → {A,C,E}.{B,D}, 0]
because the input might turn out to be bd or db. Counting all possible subsets in addition to
other states having to do with predictions, completions, and the parser’s start symbol, there
are 44 states in S0. (There are 338 states in the corresponding state set when the expanded
CFG G3′ is used.)

5.  The  source  of  the  difficulty 

Why  is  Shieber’s  algorithm  potentially  exponential  in  grammar  size  despite  its  “close 
relation”  to  Earley’s  algorithm,  which  has  time  complexity  polynomial  in  grammar  size? 
The  answer  lies  in  the  size  of  the  state  space  that  each  parser  uses.  Relative  to  grammar  size, 
Shieber’s  algorithm  involves  a  much  larger  bound  than  Earley’s  algorithm  on  the  number 
of  states  in  a  state  set.  Since  the  main  task  of  the  Earley  parser  is  to  perform  scan,  predict, 
and  complete  operations  on  the  states  in  each  state  set  (Earley,  1970:97),  an  explosion  in 
the  size  of  the  state  sets  will  be  fatal  to  any  small  runtime  bound. 

Given a CFG $G_a$, how many possible dotted rules are there? Resulting from each rule
$X \to A_1 \ldots A_k$, there are $k + 1$ possible dotted rules. Then the number of possible dotted
rules is bounded by $|G_a|$, if this notation is taken to mean the number of symbols that it
takes to write $G_a$ down. An Earley state is a pair $[r, i]$, where $r$ is a dotted rule and $i$ is
an interword position ranging from 0 to the length $n$ of the input string. Because of these
limits, no state set in the Earley parser can contain more than $O(|G_a| \cdot n)$ (distinct) states.

The limited size of a state set allows an $O(|G_a|^2 \cdot n^3)$ bound to be placed on the
runtime of the Earley parser. Informally, the argument (due to Earley) runs as follows.
The scan operation on a state can be done in constant time; the scan operations in a
state set thus contribute no more than $O(|G_a| \cdot n)$ computational steps. All of the predict
operations in a state set taken together can add no more states than the number of rules
in the grammar, bounded by $|G_a|$, since a nonterminal needs to be expanded only once in
a state set regardless of how many times it is predicted; hence the predict operations need
not take more than $O(|G_a| \cdot n + |G_a|) = O(|G_a| \cdot n)$ steps. Finally, there are the complete
operations to be considered. A given completion can do no worse than advancing every
state in the state set indicated by the return pointer. Therefore, $k$ completions require at
most $k^2$ steps; the complete operations in a state set can take no more than $O(|G_a|^2 \cdot n^2)$
steps. Overall, then, it takes no more than $O(|G_a|^2 \cdot n^2)$ steps to process one state set and
no more than $O(|G_a|^2 \cdot n^3)$ steps for the Earley parser to process them all.

In Shieber’s parser, though, the state sets can grow much larger relative to grammar
size. Given a UCFG $G_b$, how many possible dotted UCFG rules are there? Resulting from
a rule $X \to A_1 \ldots A_k$, there are not $k + 1$ possible dotted rules tracking linear advancement,
but $2^k$ possible dotted UCFG rules tracking accumulation of set elements. In the worst
case, the grammar contains only one rule and $k$ is on the order of $|G_b|$; hence the number
of possible dotted UCFG rules for the whole grammar is not bounded by $|G_b|$, but by $2^{|G_b|}$.
(Recall the exponential blowup demonstrated for grammar G3 in section 4.)

Figure 2: This graph illustrates a trivial instance of the vertex cover problem. The set
{c, d} is a vertex cover of size 2.

Informally speaking, the reason why Shieber’s parser sometimes suffers from
combinatorial explosion is that there are exponentially more possible ways to progress through
an unordered rule expansion than an ordered one. When disambiguating information is
scarce, the parser must keep track of all of them. In the more general task of parsing
ID/LP grammars, the most tractable case occurs when constraint from the LP relation is
strong enough to force a unique ordering for every rule expansion. Under such conditions,
Shieber’s parser reduces to Earley’s. However, the case of strong constraint represents the
best case computationally, rather than the worst case as Shieber (1983:14) claims.
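
The gap between the two state spaces can be made concrete with a small enumeration. The following Python sketch lists the dotted rules available for one ordered right-hand side and the dotted UCFG rules available for the same symbols taken as a multiset; the function names are hypothetical, and the count of 2^k applies when the k symbols are distinct (repeated symbols yield fewer distinct states).

```python
from itertools import combinations


def dotted_cfg_rules(rhs):
    """All (before, after) splits of an ordered right-hand side: k + 1 of them."""
    return [(rhs[:i], rhs[i:]) for i in range(len(rhs) + 1)]


def dotted_ucfg_rules(rhs):
    """All distinct (before, after) multiset splits of an unordered right-hand side."""
    positions = range(len(rhs))
    states = set()
    for size in range(len(rhs) + 1):
        for chosen in combinations(positions, size):
            before = tuple(sorted(rhs[i] for i in chosen))
            after = tuple(sorted(rhs[i] for i in positions if i not in chosen))
            states.add((before, after))
    return states


if __name__ == "__main__":
    for k in range(1, 11):
        rhs = [f"A{i}" for i in range(k)]
        print(k, len(dotted_cfg_rules(rhs)), len(dotted_ucfg_rules(rhs)))
```

The second column grows linearly and the third doubles with each added symbol, which is the source of the exponential bound on state-set size discussed above.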


6.  ID/LP  parsing  is  inherently  difficult 

The worst-case time complexity of Shieber’s algorithm is exponential in grammar size
rather than quadratic as Shieber (1983:15) believed. Did Shieber simply choose a poor
algorithm, or is ID/LP parsing inherently difficult in the general case? In fact, the simpler
problem of recognizing sentences according to a UCFG is NP-complete.⁶ Consequently,
unless P = NP, no algorithm for ID/LP parsing can have a runtime bound that is polynomial
in the size of the grammar and input.

The proof of NP-completeness involves reducing the vertex cover problem (Garey
and Johnson, 1979:46) to the UCFG recognition problem. Through careful construction
of the grammar and input string, it is possible to “trick” the parser into solving a known
hard problem. The vertex cover problem involves finding a small set of vertices in a graph
with the property that every edge of the graph has at least one endpoint in the set. Figure 2
shows a trivial example.

To construct a grammar that encodes the question of whether the graph in Figure 2
has a vertex cover of size 2, first take the vertex names a, b, c, and d as the alphabet. Take
START as the start symbol. Take H1 through H4 as special symbols, one per edge; also
take U and D as special dummy symbols.

START → H1 H2 H3 H4 U U D D D D
H1 → a | c    (H2, H3, and H4 similarly, one pair of rules per edge)
U → aaaa | bbbb | cccc | dddd
D → a | b | c | d

Figure 3: For k = 2, the construction described in the text transforms the vertex-cover
problem of Figure 2 into this UCFG. A parse exists for the string aaaabbbbccccdddd iff the
graph in the previous figure has a vertex cover of size ≤ 2.

⁶ Recognition is simpler than parsing because a recognizer is not required to recover the structure of an input
string, but only to decide whether the string is in the language generated by the grammar: that is, whether
or not there exists a parse.

Next, write the rules corresponding to the edges of the graph. Edge e1 runs from a
to c, so include the rules H1 → a and H1 → c. Encode the other edges similarly. Rules
expanding the dummy symbols are also needed. Dummy symbol D will be used to soak up
excess input symbols, so D → a through D → d should be rules. Dummy symbol U will
also be used to soak up excess input symbols, but U will be allowed to match only when
there are four occurrences in a row of the same symbol (one occurrence for each edge). Take
U → aaaa, U → bbbb, U → cccc, and U → dddd as the rules expanding U.

Now,  what  does  it  take  for  the  graph  to  have  a  vertex  cover  of  size  k  =  2?  One  way 
to  get  a  vertex  cover  is  to  go  through  the  list  of  edges  and  underline  one  endpoint  of  each 
edge.  If  the  vertex  cover  is  to  be  of  size  2,  the  underlining  must  be  done  in  such  a  way  that 
only two distinct vertices are ever touched in the process. Alternatively, since there are 4
vertices in all, the vertex cover will be of size 2 if there are 4 − 2 = 2 vertices left untouched
in  the  underlining  process.  This  method  of  finding  a  vertex  cover  can  be  translated  into  a 
UCFG  rule  as  follows: 

START → H1 H2 H3 H4 U U D D D D

That is, each H-symbol is supposed to match the name of one of the endpoints of the
corresponding edge, in accordance with the rules expanding the H-symbols. Each U-symbol
is supposed to correspond to a vertex that was left untouched by the H-matching, and the
D-symbols are just there for bookkeeping. Figure 3 lists the complete grammar that encodes
the vertex-cover problem of Figure 2.

To make all of this work properly, take

σ = aaaabbbbccccdddd

as the input string to be parsed. (In general, for every vertex name x, include in σ a
contiguous run of occurrences of x, one occurrence for each edge in the graph.) The grammar
encodes the underlining procedure by requiring each H-symbol to match one of its endpoints
in σ. Since the right-hand side of the START rule is unordered, the grammar allows an
H-symbol to match anywhere in the input, hence to match any vertex name (subject to
interference from other rules that have already matched). Furthermore, since there is one
occurrence of each vertex name for every edge, all of the edges could conceivably be matched
up with the same vertex: that is, it’s impossible to run out of vertex-name occurrences.
Consequently, the grammar will allow either endpoint of an edge to be “underlined.” The
parser will have to figure out which endpoints to choose; in other words, which vertex cover
to select. However, the grammar also requires two occurrences of U to match somewhere.
U can only match four contiguous identical input symbols that have not been matched in
any other way, and thus if the parser chooses a vertex cover that is too large, the U-symbols
will not match and the parse will fail. The proper number of D-symbols is given by the
length of the input string, minus the number of edges in the graph (to account for the
H-matches), minus the number of U-symbols times the number of edges (to account for the
U-matches): in this case, 16 − 4 − (2 · 4) = 4, as illustrated in the START rule.

The net result of this construction is that in order to decide whether σ is in the language
generated by the UCFG, the parser must in effect search for a vertex cover of size 2 or less.⁷
If a parse exists, an appropriate vertex cover can be read off from beneath the H-symbols in
the parse tree; conversely, if an appropriate vertex cover exists, it indicates how to construct
a parse. Figure 4 shows the parse tree that encodes a solution to the vertex-cover problem
of Figure 2.

The construction shows that the vertex-cover problem is reducible to UCFG recognition.
Furthermore, the construction of the grammar and input string can be carried out in
polynomial time. Consequently, UCFG recognition and the more general task of ID/LP parsing
must be computationally difficult. For a more careful and detailed treatment of the
reduction and its correctness, see the appendix.
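
The construction of the grammar and input string can be sketched as a short Python function. This is an illustration of the steps described in this section rather than the formal reduction given in the appendix. The function name and data layout are hypothetical, and the edge list in the usage example is an assumption: the text states only that edge e1 joins a and c and that {c, d} is a cover, so the other three edges are made up to be consistent with that. The number of U-symbols is taken to be the number of vertices minus k, which is 2 in the example.

```python
def vertex_cover_to_ucfg(vertices, edges, k):
    """Build (rules, sigma) encoding: does the graph have a vertex cover of size <= k?

    `rules` maps each nonterminal to a list of unordered right-hand sides.
    """
    m = len(edges)
    rules = {}

    # One H-symbol per edge; it may match either endpoint of that edge.
    h_symbols = [f"H{i}" for i in range(1, m + 1)]
    for h, (u, v) in zip(h_symbols, edges):
        rules[h] = [[u], [v]]

    # U soaks up a full run of one vertex name; D soaks up any single symbol.
    rules["U"] = [[v] * m for v in vertices]
    rules["D"] = [[v] for v in vertices]

    # Input string: for every vertex, one occurrence per edge, in a contiguous run.
    sigma = "".join(v * m for v in vertices)

    # Vertices left untouched by the cover are matched by U; D fills the rest.
    n_u = len(vertices) - k
    n_d = len(sigma) - m - n_u * m
    rules["START"] = [h_symbols + ["U"] * n_u + ["D"] * n_d]
    return rules, sigma


if __name__ == "__main__":
    # Assumed edge list; only the first edge (a, c) is given in the text.
    EDGES = [("a", "c"), ("b", "c"), ("c", "d"), ("b", "d")]
    rules, sigma = vertex_cover_to_ucfg("abcd", EDGES, k=2)
    print(sigma)
    print(rules["START"][0])
```

Every step is a simple loop over the vertices or edges, which is consistent with the observation that the construction runs in polynomial time.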

7.  Computational  implications 

The reduction of Vertex Cover shows that the ID/LP parsing problem is NP-complete.
Unless P = NP, the time complexity of ID/LP parsing cannot be bounded by any
polynomial in the size of the grammar and input.⁸ An immediate conclusion is that complexity
analysis must be done carefully: despite its similarity to Earley’s algorithm, Shieber’s
algorithm does not have complexity $O(|G|^2 \cdot n^3)$. For some choices of grammar and input, its
internal structures undergo exponential growth. Other consequences also follow.

7.1.  Parsing  the  object  grammar 

Even in the face of its combinatorially explosive worst-case behavior, Shieber’s
algorithm should not be immediately cast aside. Despite the fact that it sometimes blows up,
it still has an advantage over the alternative of parsing the expanded “object grammar.”
One interpretation of the NP-completeness result is that the general case of ID/LP parsing
is inherently difficult; hence it should not be surprising that Shieber’s algorithm for solving
that problem can sometimes suffer from combinatorial explosion. More significant is the fact
that parsing with the expanded CFG blows up in cases that should not be difficult. There
is nothing inherently difficult about parsing the language that consists of all permutations
of the string abcde, but while parsing that language the Earley parser can use 24 states or
more to encode what the Shieber parser encodes in only one (§3). To put the point another
way, the significant fact is not that the Shieber parser can blow up; it is that the use of an
expanded CFG blows up unnecessarily.

Figure 4: The grammar of Figure 3, which encodes the vertex-cover problem of Figure 2,
generates the string σ = aaaabbbbccccdddd according to this parse tree. The vertex cover
{c, d} can be read off from the parse tree as the set of elements dominated by H-symbols.

⁷ If the vertex cover is smaller than expected, the D-symbols will soak up the extra contiguous runs that
could have been matched by more U-symbols.

⁸ Even assuming P ≠ NP, it does not follow that the time complexity must be exponential, though it seems
likely to be. There are functions that fall between polynomials and exponentials. See
Hopcroft and Ullman (1979:341).

7.2.  Is  precompilation  possible? 

The present reduction of Vertex Cover to ID/LP Parsing involves constructing a
grammar and input string that both depend on the problem to be solved. Consequently, the
reduction does not rule out the possibility that through clever programming one might
concentrate most of the computational difficulty of ID/LP parsing into a separate
precompilation stage, dependent on the grammar but independent of the input. According to this
optimistic scenario, the entire procedure of preprocessing the grammar and parsing the
input string would be as difficult as any NP-complete problem, but after precompilation, the
time required for parsing a particular input would be bounded by a polynomial in grammar
size and sentence length.

Regarding the case immediately at hand, Shieber's modified Earley algorithm has no
precompilation step.9 The complexity result implied by the reduction thus applies with
full force; any possible precompilation phase has yet to be proposed. Moreover, it is by no
means clear that a clever precompilation step is even possible; it depends on exactly how
|G| and n enter into the complexity function for ID/LP parsing. If n enters as a factor
multiplying an exponential, precompilation cannot help enough to ensure that the parsing
phase will run in polynomial time.

For example, suppose some parsing problem is known to require $2^{|G|} \cdot n^3$ steps for
solution.10 If one is willing to spend, say, $10 \cdot 2^{|G|}$ steps in the precompilation phase, is it
possible to reduce parsing-phase complexity to something like $|G|^8 \cdot n^3$? The answer is no.
Since by hypothesis it takes at least $2^{|G|} \cdot n^3$ steps to solve the problem, there must be at
least $2^{|G|} \cdot n^3 - 10 \cdot 2^{|G|}$ steps left to perform after the precompilation phase. The parameter
n is necessarily absent from the precompilation complexity, hence the term $2^{|G|} \cdot n^3$ will
eventually dominate.
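The arithmetic in this example can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and the figures $2^{|G|} \cdot n^3$, $10 \cdot 2^{|G|}$, and $|G|^8 \cdot n^3$ are simply the illustrative quantities assumed in the paragraph above.

```python
def total_steps(g: int, n: int) -> int:
    """Hypothesized lower bound on the total work: 2^|G| * n^3."""
    return 2**g * n**3


def precompile_budget(g: int) -> int:
    """Work spent in the input-independent precompilation phase: 10 * 2^|G|."""
    return 10 * 2**g


def remaining_steps(g: int, n: int) -> int:
    """Steps that must still be performed at parse time (never negative)."""
    return max(0, total_steps(g, n) - precompile_budget(g))


def hoped_for_bound(g: int, n: int) -> int:
    """The polynomial parse-time bound one might hope for: |G|^8 * n^3."""
    return g**8 * n**3


if __name__ == "__main__":
    n = 10
    for g in (10, 40, 60):
        # True means the work left over already exceeds the hoped-for bound.
        print(g, remaining_steps(g, n) > hoped_for_bound(g, n))
```

For small grammars the polynomial $|G|^8 \cdot n^3$ is the larger quantity, but once $|G|$ is large enough the leftover work exceeds it for any fixed n of moderate size, which is the sense in which the exponential term eventually dominates.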

In  a  related  vein,  suppose  the  precompilation  step  is  conversion  from  ID/LP  to  CFG 
form  and  the  runtime  step  is  the  use  of  the  Earley  parser  on  the  expanded  CFG.  Although 
the  precompilation  step  does  a  potentially  exponential  amount  of  work  in  producing  G' 
from  G,  another  exponential  factor  still  shows  up  at  runtime  because  |G'|  in  the  complexity 
bound $|G'|^2 \cdot n^3$ is exponentially larger than the original |G|.
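To make the size of the expanded grammar concrete, the sketch below multiplies out a single unordered rule into the ordered CFG rules it stands for. It is only an illustration under simple assumptions: a rule is a hypothetical `(lhs, daughters)` pair, no LP constraints are applied, and `expand_unordered_rule` is a name introduced here.

```python
from itertools import permutations


def expand_unordered_rule(lhs, daughters):
    """Return the distinct ordered CFG rules equivalent to one unordered rule.

    `daughters` is any sequence of symbols; a set is used so that repeated
    symbols do not produce duplicate ordered rules.
    """
    return sorted({(lhs, order) for order in permutations(daughters)})


if __name__ == "__main__":
    rules = expand_unordered_rule("S", "abcde")
    # One unordered rule with five distinct daughters becomes 5! ordered rules.
    print(len(rules))
```

A rule with k distinct daughters expands into k! ordered rules, so |G'| grows faster than any polynomial in |G| as rules get longer.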

7.3.  Polynomial-time  parsing  of  a  fixed  grammar 

As noted above, both grammar and input in the current vertex-cover reduction depend
on the vertex-cover problem to be solved. The NP-completeness result would be
strengthened if there were a reduction that used the same fixed grammar for all vertex-cover
problems, for it would then be possible to prove that a precompilation phase would
be of little avail. However, unless P = NP, it is impossible to design such a reduction. Since
grammar size is not considered to be a parameter of a fixed-grammar parsing problem, the
use of the Earley parser on the object grammar constitutes a polynomial-time algorithm for
solving the fixed-grammar ID/LP parsing problem.

Although  ID/LP  parsing  for  a  fixed  grammar  can  thus  be  done  in  cubic  time,  that  fact 
has  little  practical  significance.  The  object  grammar  G'  corresponding  to  a  practical  ID/LP 
grammar would be huge, and if $|G'|^2 \cdot n^3$ complexity is too slow, then it remains too slow
when $|G'|^2$ is regarded as a constant.

7.4.  The  power  of  the  UCFG  formalism 

The  Vertex  Cover  reduction  also  helps  pin  down  the  computational  power  of  the  UCFG 
formalism.  As  Gi  and  G\  in  section  3  illustrated,  a  UCFG  (or  an  ID/LP  grammar)  can  enjoy 

9 Shieber (1983:15 n. 0) mentions a possible precompilation step, but it is concerned with the LP relation
rather than the ID rules.

10 It is not known whether the worst-case complexity of ID/LP parsing is exponential, since more generally
it is not known for sure that P ≠ NP.
considerable brevity of expression compared to the equivalent CFG. The NP-completeness
result illuminates this property in two ways. First, the result shows that this brevity of
expression is sufficient to allow an instance of any problem in NP to be stated in a UCFG
that is only polynomially larger than the original problem instance. In contrast, if an
attempt is made to replicate the current reduction with a CFG rather than UCFG, the
necessity of spelling out all the orders in which the H-, U-, and D-symbols might appear
makes the CFG more than polynomially larger than the problem instance. Consequently,
the reduction fails to establish NP-completeness, which indeed does not hold. Second,
the result shows that the increased expressive power does not come free; while the CFG
recognition problem can be solved in cubic time or less,11 unless P = NP the general UCFG
recognition problem cannot be solved in polynomial time.

The  details  of  the  reduction  also  help  pin  down  how  powerful  a  single  UCFG  rule  can 
be.  If  the  UCFG  formalism  is  extended  to  permit  ordinary  CFG  rules  in  addition  to  rules 
with unordered expansions, the grammar that expresses a vertex-cover problem needs only
one  UCFG  rule,  although  that  rule  may  need  to  be  arbitrarily  long. 

7.5.  The  role  of  constraint 

Finally, the discussion of section 5 illustrates the way in which the weakening of
constraints can often make a problem computationally more difficult. It might erroneously be
thought that weak constraints represent the best case in computational terms, for “weak”
constraints sound easy to verify. However, oftentimes the weakening of constraint multiplies
the number of possibilities that must be considered in the course of solving a problem. In
the case at hand, the removal of constraints on the order in which constituents can appear
causes the dependence of parsing complexity on grammar size to grow from $|G|^2$ to $2^{|G|}$.
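The way weaker ordering constraints multiply possibilities can be seen by counting the daughter orders that survive a given LP relation. The sketch below is illustrative only: the function name and the encoding of LP constraints as `(x, y)` pairs, read as "x must precede y whenever both are sisters", are assumptions made here, and the daughters are assumed to be distinct.

```python
from itertools import permutations


def admissible_orders(daughters, lp):
    """List the orderings of `daughters` that satisfy every LP pair in `lp`."""
    result = []
    for order in permutations(daughters):
        position = {symbol: i for i, symbol in enumerate(order)}
        # An LP pair only constrains an order in which both symbols occur.
        if all(position[x] < position[y]
               for x, y in lp if x in position and y in position):
            result.append(order)
    return result


if __name__ == "__main__":
    for lp in ([("a", "b"), ("b", "c"), ("c", "d")], [("a", "b")], []):
        print(len(lp), "LP constraints:", len(admissible_orders("abcd", lp)), "orders")
```

With a total order on four daughters only one ordering has to be considered; dropping constraints leaves more and more orderings open, up to all 4! when the LP relation is empty.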

8.  Linguistic  implications 

Significantly, the key ingredients that can cause difficulties for the ID/LP parsing
algorithm are not exotically foreign to linguistic theory. Most current formalisms (e.g.
GB-theory and GPSG) permit the existence of constituents that are empty on the surface; hence
in  principle  they  permit  the  kind  of  pathological  case  illustrated  by  G j  in  section  4,  subject 
to  amelioration  by  additional  constraints.  Similarly,  a  key  ingredient  of  the  vertex-cover 
reduction  is  lexical  ambiguity  —  acknowledged  by  every  current  theory. 

Nonetheless, the implications of the NP-completeness result for grammatical theory
are fewer than they might seem. The reduction contributes to the necessary goal of
understanding the computational power of various mechanisms and formal devices, but it does
not  (for  instance)  rule  out  the  use  of  formalisms  that  decouple  constraints  on  order  from 
constraints  on  linear  precedence. 

Under the assumption that natural languages are efficiently parsable, computational
difficulties in parsing a formalism do indicate that the formalism itself does not tell the
11 Since $O(|G|^2 \cdot n^3) < O((|G| + n)^3)$, the complexity of Earley's algorithm is no worse than cubic in the
combined length of grammar and input.


whole story. That is, they point out that the range of possible languages has been
incorrectly characterized: the additional constraints that guarantee efficient parsability remain
unstated. Since the general case of parsing ID/LP grammars is computationally difficult, if
the linguistically relevant ID/LP grammars are to be efficiently parsable, there must be
additional factors that guarantee, say, a certain amount of constraint from the LP relation.12
(Constraints beyond the bare ID/LP formalism are required on linguistic grounds as well.)
Note that the subset principle of language acquisition (cf. Berwick and Weinberg, 1984:233)
would lead the language learner to initially hypothesize strong order constraints, to be
weakened only in response to positive evidence.

However, there are other potential ways to guarantee efficient parsability. It might turn
out that the principles and parameters of the best grammatical theory permit languages that
are not efficiently parsable in the worst case, just as grammatical theory permits sentences
that are deeply center-embedded (Miller and Chomsky, 1963).13 In such a situation, difficult
languages or sentences would not be expected to turn up in general use, precisely because
they would be difficult to process.14 The factors that guarantee efficient parsability would
not be part of grammatical theory because they would result from extragrammatical factors,
i.e. the resource limitations of the language-processing mechanisms. This “easy way out”
is not automatically available, depending as it does on a detailed account of processing
mechanisms. For example, in the Earley parser, the difficulty of parsing a construction
can vary widely with the amount of lookahead used (if any). Like any other theory, an
explanation based on resource limitations must make the right predictions about which
constructions will be difficult to parse.

In  the  same  way,  the  language-acquisition  procedure  could  potentially  be  the  source  of 
some  constraints  relevant  to  efficient  parsability.  Perhaps  not  all  of  the  languages  permitted 
by the principles and parameters of syntactic theory are accessible in the sense that they
can  potentially  be  constructed  by  the  language-acquisition  component.  It  is  to  be  expected 
that  language-acquisition  mechanisms  will  be  subject  to  various  kinds  of  limitations  just 
as  all  other  mental  mechanisms  are.  Again,  however,  concrete  conclusions  must  await  a 
detailed  proposal. 


12 In the GB-framework of Chomsky (1981), for instance, the syntactic expression of unordered θ-grids at the
X level is constrained by the principles of Case theory. Endocentricity is another significant constraint. See
also Berwick's (1982) discussion of constraints that could be placed on another grammatical formalism,
lexical-functional grammar, to avoid a similar intractability result.

13 Indeed, one may not conclude a priori that the languages permitted by linguistic theory are parsable at all
(Chomsky, 1980).

14 It is often anecdotally remarked that languages that allow relatively free word order tend to make heavy
use of inflections. A rich inflectional system can supply parsing constraints that make up for the lack
of ordering constraints; thus the situation we do not find is the computationally difficult case of weak
constraint.
9. Appendix


This appendix contains the details of a careful reduction of the vertex-cover problem to
the UCFG recognition problem. This version of the reduction establishes that the difficulty
of UCFG recognition is not due either to the possibility of empty constituents ($\varepsilon$-rules) or
to the possibility of repeated symbols in rules (i.e. to the use of multisets rather than sets).
Consequently, it is somewhat different from and more complex than the one sketched in the
text.


9.1.  Defining  unordered  context-free  grammars 

Definition: An unordered CFG (UCFG) is a quadruple $(N, \Sigma, R, S)$, where:

(a) $N$ is a finite set of nonterminals.

(b) $\Sigma$, disjoint from $N$, is a finite, nonempty set of terminal symbols.

(c) $R$ is a nonempty set of rules $(A, \alpha)$, where $A \in N$ and $\alpha \in (N \cup \Sigma)^*$. The rule
$(A, \alpha)$ may be written as $A \to \alpha$.

(d) $S \in N$ is the start symbol.

Convention: The grammar $G$ and its components $N, \Sigma, R, S$ need not be explicitly
mentioned when clear from context.

Convention: Unless otherwise noted,

(a) $A, A', A_i, \ldots$ denote elements of $N$;

(b) $a, a', a_i, \ldots$ denote elements of $\Sigma$;

(c) $X, Y, X', Y', X_i, Y_i, \ldots$ denote elements of $N \cup \Sigma$;

(d) $\sigma, u, u', u_i, \ldots$ denote elements of $\Sigma^*$;

(e) $\alpha, \beta, \gamma, \varphi, \psi$ denote elements of $(N \cup \Sigma)^*$.

Definition: $G = (N, \Sigma, R, S)$ is $\varepsilon$-free iff for every $(A, \alpha) \in R$, $|\alpha| \neq 0$.

Definition: $G = (N, \Sigma, R, S)$ is branching iff for some $(A, \alpha) \in R$, $|\alpha| > 1$.

Definition: $G = (N, \Sigma, R, S)$ is duplicate-free iff for every $(A, \alpha) \in R$, $\alpha = Y_1 \ldots Y_n$ and
for all $i, j \in [1, n]$, $Y_i = Y_j$ iff $i = j$.

Definition: $G = (N, \Sigma, R, S)$ is simple iff it is $\varepsilon$-free, duplicate-free, and branching.

Note.  The  notion  of  a  simple  UCFG  is  introduced  in  order  to  help  pin  down  the  source  of 
any  computational  difficulties  associated  with  UCFGs.  For  example,  since  simple  UCFGs 
are  restricted  to  be  duplicate-free,  a  difficulty  that  arises  with  simple  UCFGs  cannot  result 
from  the  possibility  that  a  symbol  may  occur  more  than  once  on  the  right-hand  side  of  a 
rule. 
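The three conditions that make a UCFG simple are easy to test mechanically. The sketch below assumes a hypothetical encoding in which the rule set is a list of `(lhs, rhs)` pairs with `rhs` a tuple of symbols; the function names are introduced here for illustration.

```python
def is_epsilon_free(rules):
    """Every rule has a nonempty right-hand side."""
    return all(len(rhs) != 0 for _, rhs in rules)


def is_branching(rules):
    """Some rule has more than one symbol on its right-hand side."""
    return any(len(rhs) > 1 for _, rhs in rules)


def is_duplicate_free(rules):
    """No symbol occurs twice on the right-hand side of any single rule."""
    return all(len(set(rhs)) == len(rhs) for _, rhs in rules)


def is_simple(rules):
    """A UCFG is simple iff it is epsilon-free, duplicate-free, and branching."""
    return is_epsilon_free(rules) and is_duplicate_free(rules) and is_branching(rules)


if __name__ == "__main__":
    print(is_simple([("S", ("a", "B", "c")), ("B", ("b",))]))
    print(is_simple([("S", ("a", "a"))]))
```

Note that branching is a property of the grammar as a whole (one long rule suffices), whereas the other two conditions must hold of every rule.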

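Definition: $\varphi A \psi \Rightarrow_G \varphi \alpha \psi$ (by $r$) just in case (for some) $r = (A', Y_1 \ldots Y_n) \in R$ and
for some permutation $\rho$ of $[1, n]$, $A = A'$ and $\alpha = Y_{\rho(1)} \ldots Y_{\rho(n)}$. If $\varphi \in \Sigma^*$, also write
$\varphi A \psi \Rightarrow_{lm} \varphi \alpha \psi$.

Definition: $L(G) = \{\sigma \in \Sigma^* : S \Rightarrow^* \sigma\}$.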


$V = \{v, w, x, y, z\}$

$E = \{e_1, e_2, e_3, e_4, e_5, e_6, e_7\}$

with the $e_i$ as indicated

$k = 3$


Figure 5: The triple $(V, E, k)$ is an instance of VERTEX COVER. The set $V' = \{v, x, z\}$ is
a vertex cover of size $k = 3$.


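Definition: An n-step derivation of $\psi$ from $\varphi$ is a sequence $(\varphi_0, \ldots, \varphi_n)$ such that $\varphi_0 = \varphi$,
$\varphi_n = \psi$, and for all $i \in [0, n - 1]$, $\varphi_i \Rightarrow \varphi_{i+1}$. If it is also true for all $i$ that $\varphi_i \Rightarrow_{lm} \varphi_{i+1}$,
say that the derivation is leftmost.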

9.2. Defining the computational problems

Definition: A possible instance of the problem VERTEX COVER is a triple $(V, E, k)$,
where $(V, E)$ is a finite graph with at least one edge and at least two vertices, $k \in \mathbb{N}$, and
$k \le |V|$.15 VERTEX COVER itself consists of all possible instances $(V, E, k)$ such that for
some $V' \subseteq V$, $|V'| \le k$ and for all edges $e \in E$, at least one endpoint of $e$ is in $V'$. (Figure 5
gives an example of a VERTEX COVER instance.)
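The membership condition can be stated directly as a brute-force check. This is only a sketch of the definition, not an efficient procedure: it tries every candidate subset and so takes exponential time in general. The encoding (vertices as a list, edges as pairs of vertices) and the function names are assumptions made here.

```python
from itertools import combinations


def is_vertex_cover(cover, edges):
    """True iff every edge has at least one endpoint in `cover`."""
    return all(u in cover or v in cover for u, v in edges)


def in_vertex_cover(vertices, edges, k):
    """Decide (V, E, k) in VERTEX COVER by trying every V' with |V'| <= k."""
    for size in range(min(k, len(vertices)) + 1):
        for subset in combinations(vertices, size):
            if is_vertex_cover(set(subset), edges):
                return True
    return False


if __name__ == "__main__":
    # A hypothetical triangle graph: any two vertices cover it, no single one does.
    triangle = [("a", "b"), ("b", "c"), ("a", "c")]
    print(in_vertex_cover(["a", "b", "c"], triangle, 2))
    print(in_vertex_cover(["a", "b", "c"], triangle, 1))
```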

Fact: VERTEX COVER is NP-complete. (Garey and Johnson, 1979:46)

Definition: A possible instance of the problem SIMPLE UCFG RECOGNITION is a pair
$(G, \sigma)$, where $G$ is a simple UCFG and $\sigma \in \Sigma^*$. SIMPLE UCFG RECOGNITION itself
consists of all possible instances $(G, \sigma)$ such that $\sigma \in L(G)$.

Notation: Take $\|\cdot\|$ to be any reasonable measure of the encoded input length for a
computational problem; continue to use $|\cdot|$ for set cardinality and string length. It is reasonable
to require that if $S$ is a set, $k \in \mathbb{N}$, and $|S| \ge k$, then $\|S\| \ge \|k\|$; that is, the encoding of
numbers is better than unary. It is also reasonable to require that $\|(\ldots, x, \ldots)\| \ge \|x\|$.

15 This formulation differs trivially from the one cited by Garey and Johnson.

9.3. The UCFG recognition problem is in NP

Lemma 9.1: Let $(\varphi_0, \ldots, \varphi_k)$ be a shortest leftmost derivation of $\varphi_k$ from $\varphi_0$ in a
branching $\varepsilon$-free UCFG. If $k > |N| + 1$ then $|\varphi_k| > |\varphi_0|$.

Proof. There exists some sequence of rules $(A_0, \alpha_0), \ldots, (A_{k-1}, \alpha_{k-1})$ such that for all
$i \in [0, k - 1]$, $\varphi_i \Rightarrow_{lm} \varphi_{i+1}$ by $(A_i, \alpha_i)$. Since $G$ is $\varepsilon$-free, $|\varphi_{i+1}| \ge |\varphi_i|$ always.

Case 1. For some $i$, $|\alpha_i| > 1$. Then $|\varphi_{i+1}| > |\varphi_i|$. Hence $|\varphi_k| > |\varphi_0|$.

Case 2. For every $i$, $|\alpha_i| = 1$. Then there exist $u, \gamma$ such that for every $i \in [0, k - 2]$, there
is $A'_i \in N$ such that $\varphi_{i+1} = u A'_i \gamma$. Suppose the $A'_i$ are all distinct. Then $|N| \ge k - 1$,
hence $|N| + 1 \ge k$, hence $|N| + 1 > |N| + 1$, which is impossible. Hence for some $i, j \in
[0, k - 2]$, $i < j$, $A'_i = A'_j$. Hence $\varphi_{i+1} = \varphi_{j+1}$, since $[1, 1]$ has only one permutation. Then
$(\varphi_0, \ldots, \varphi_i, \varphi_{j+1}, \ldots, \varphi_k)$ is a leftmost derivation of $\varphi_k$ from $\varphi_0$ and has length less than
$k$, which is also impossible.

Then $|\varphi_k| > |\varphi_0|$. □

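Corollary 9.2: If $G$ is a branching $\varepsilon$-free UCFG and $\sigma \in L(G)$ then $\sigma$ has a leftmost
derivation of length at most $|\sigma| \cdot m$, where $m = |N| + 2$.

Proof. Let $(\varphi_0, \ldots, \varphi_k)$ be a shortest leftmost derivation of $\sigma$ from $S$. Suppose $k > |\sigma| \cdot m$.
Consider the sub-derivations

$(\varphi_0, \ldots, \varphi_m)$

$(\varphi_m, \ldots, \varphi_{2m})$

$\vdots$

$(\varphi_{(|\sigma|-1) \cdot m}, \ldots, \varphi_{|\sigma| \cdot m})$

$(\varphi_{|\sigma| \cdot m}, \ldots, \varphi_k)$.

Each one except the last has $m$ steps and $m > |N| + 1$. Then by lemma,

$|\varphi_{|\sigma| \cdot m}| > |\varphi_{(|\sigma|-1) \cdot m}| > \cdots > |\varphi_m| > |\varphi_0| = 1$.

Then $|\sigma| \ge 1 + |\sigma|$, which is impossible. Hence $k \le |\sigma| \cdot m$. □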

Lemma 9.3: $\Pi$ = SIMPLE UCFG RECOGNITION is in the computational class NP.

Proof. Let $G = (N, \Sigma, R, S)$ be a simple UCFG and $\sigma \in \Sigma^*$. Consider the following
nondeterministic algorithm with input $(G, \sigma)$:

Step 1. Write down $\varphi_0 = S$.

Step 2. Perform the following steps for $i$ from 0 to $|\sigma| \cdot m - 1$, where $m = |N| + 2$.

(a) Express $\varphi_i$ as $u_i A_i \gamma_i$ by finding the leftmost nonterminal, or loop if impossible.

(b) Guess a rule $(A_i, Y_{i,1} \ldots Y_{i,k_i}) \in R$ and a permutation $\rho_i$ of $[1, k_i]$, or loop if there is
no such rule.

(c) Write down $\varphi_{i+1} = u_i Y_{i,\rho_i(1)} \ldots Y_{i,\rho_i(k_i)} \gamma_i$.

(d) If $\varphi_{i+1} = \sigma$ then halt.

Step 3. Loop.

It should be apparent that the algorithm runs in time at worst polynomial in $\|(G, \sigma)\|$; note
that the length of $\varphi_i$ increases by at most a constant amount on each iteration.

Assume $(G, \sigma) \in \Pi$. Then $\sigma$ has a leftmost derivation of length at most $|\sigma| \cdot m$ by
Corollary 9.2; hence the nondeterministic algorithm will be able to guess it and will halt.
Conversely, suppose the algorithm halts on input $(G, \sigma)$. On the iteration when the algorithm
halts, the sequence $(\varphi_0, \ldots, \varphi_{i+1})$ will constitute a leftmost derivation of $\sigma$ from $S$; hence
$\sigma \in L(G)$ and $(G, \sigma) \in \Pi$.

Then there is a nondeterministic algorithm that runs in polynomial time and accepts exactly
$\Pi$. Hence $\Pi \in$ NP. □
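The guessing algorithm in the proof can be simulated deterministically by exploring every choice of rule and permutation, at the cost of exponential time in the worst case. The sketch below is one such simulation under stated assumptions, not the procedure of the proof itself: the grammar encoding (a set of nonterminals plus a list of `(lhs, rhs)` pairs) and the name `recognize` are hypothetical, and two prunings are added that are sound only because the grammar is assumed to be $\varepsilon$-free (sentential forms never shrink, and the terminal prefix of a leftmost sentential form never changes).

```python
from itertools import permutations


def recognize(nonterminals, rules, start, sigma):
    """Exhaustively simulate the nondeterministic recognizer for a simple UCFG."""
    sigma = tuple(sigma)
    bound = len(sigma) * (len(nonterminals) + 2)   # |sigma| * m, with m = |N| + 2
    frontier = {(start,)}                          # Step 1: phi_0 = S
    seen = set(frontier)
    for _ in range(bound):                         # Step 2
        next_frontier = set()
        for form in frontier:
            # (a) split the form as u A gamma at the leftmost nonterminal
            idx = next((i for i, x in enumerate(form) if x in nonterminals), None)
            if idx is None:
                continue                           # no nonterminal left: dead branch
            u, a, gamma = form[:idx], form[idx], form[idx + 1:]
            # (b) try every rule for A and every permutation of its right side
            for lhs, rhs in rules:
                if lhs != a:
                    continue
                for order in set(permutations(rhs)):
                    new_form = u + order + gamma   # (c)
                    if len(new_form) > len(sigma):
                        continue                   # forms never shrink
                    t = 0
                    while t < len(new_form) and new_form[t] not in nonterminals:
                        t += 1
                    if new_form[:t] != sigma[:t]:
                        continue                   # terminal prefix already wrong
                    if new_form == sigma:          # (d)
                        return True
                    if new_form not in seen:
                        seen.add(new_form)
                        next_frontier.add(new_form)
        frontier = next_frontier
    return False                                   # Step 3: no guess sequence halts


if __name__ == "__main__":
    N = {"S", "B"}
    R = [("S", ("a", "B", "c")), ("B", ("b",)), ("B", ("d", "e"))]
    print(recognize(N, R, "S", "cab"), recognize(N, R, "S", "abd"))
```

The step bound comes from Corollary 9.2; the set of already-seen forms only avoids repeating work and does not change which strings are accepted.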

9.4.  The  UCFG  recognition  problem  is  NP-complete 

Lemma 9.4: Let $(V, E, k) = (V, \{e_i\}, k)$ be a possible instance of VERTEX COVER. Then
it is possible to construct, in time polynomial in $\|V\|$, $\|E\|$, and $k$, a simple UCFG $G(V, E, k)$
and a string $\sigma(V, E, k)$ such that

$(G(V, E, k), \sigma(V, E, k)) \in$ SIMPLE UCFG RECOGNITION
iff $(V, E, k) \in$ VERTEX COVER.

Proof. Construct $G(V, E, k)$ as follows. Let the set $N$ of nonterminals consist of the following
symbols not in $V$:

START, $U$, $D$,

$H_i$ for $i \in [1, |E|]$,

$U_i$ for $i \in [1, |V| - k]$,

$D_i$ for $i \in [1, |E| \cdot (k - 1)]$.

$\|N\|$ will be at worst polynomial in $\|E\|$, $\|V\|$, and $k$ for a reasonable length measure. Define
the terminal vocabulary $\Sigma$ to consist of subscripted symbols as follows:

$\Sigma = \{a_i : a \in V, i \in [1, |E|]\}$.

Designate START as the start symbol. Include the following as members of the rule set $R$:

(a) Include the rule

START $\to H_1 \ldots H_{|E|} U_1 \ldots U_{|V|-k} D_1 \ldots D_{|E| \cdot (k-1)}$.

(b) For each $e_i \in E$, include the rules

$\{H_i \to a_i : a$ an endpoint of $e_i\}$.
START → H_1 H_2 H_3 H_4 H_5 H_6 H_7 U_1 U_2 D_1 D_2 D_3 D_4 D_5 D_6 D_7 D_8 D_9 D_10 D_11 D_12 D_13 D_14

H_1 → v_1 | w_1
H_2 → v_2 | y_2
H_3 → w_3 | x_3
H_4 → v_4 | z_4
H_5 → x_5 | y_5
H_6 → y_6 | z_6
H_7 → x_7 | z_7

U_1 → U
U_2 → U

U → v_1 v_2 v_3 v_4 v_5 v_6 v_7 | w_1 w_2 w_3 w_4 w_5 w_6 w_7 | x_1 x_2 x_3 x_4 x_5 x_6 x_7
  | y_1 y_2 y_3 y_4 y_5 y_6 y_7 | z_1 z_2 z_3 z_4 z_5 z_6 z_7

D_1 → D     D_2 → D     D_3 → D     D_4 → D     D_5 → D     D_6 → D     D_7 → D
D_8 → D     D_9 → D     D_10 → D    D_11 → D    D_12 → D    D_13 → D    D_14 → D

D → v_1 | v_2 | v_3 | v_4 | v_5 | v_6 | v_7
  | w_1 | w_2 | w_3 | w_4 | w_5 | w_6 | w_7
  | x_1 | x_2 | x_3 | x_4 | x_5 | x_6 | x_7
  | y_1 | y_2 | y_3 | y_4 | y_5 | y_6 | y_7
  | z_1 | z_2 | z_3 | z_4 | z_5 | z_6 | z_7

Figure 6: The construction of Lemma 9.4 produces this grammar when applied to the
VERTEX COVER problem of Figure 5. The H-symbols ensure that the solution that is
found must hit each of the edges, while the U-symbols ensure that enough elements of V
remain untouched to satisfy the requirement $|V'| \le k$. The D-symbols are dummies that
absorb excess input symbols. A shorter grammar than this will suffice if the grammar is
not required to be duplicate-free.


(c) For each $i \in [1, |V| - k]$, include the rule $U_i \to U$. Also include the rules

$\{U \to a_1 \ldots a_{|E|} : a \in V\}$.

(d) For each $i \in [1, |E| \cdot (k - 1)]$, include the rule $D_i \to D$. Also include the rules

$\{D \to a : a \in \Sigma\}$.

Take $G(V, E, k)$ to be $(N, \Sigma, R, \text{START})$. (Figure 6 shows the result of applying this
construction to the VERTEX COVER instance of Figure 5.)




Let $h : [1, |V|] \to V$ be some standard enumeration of the elements of $V$. Construct
$\sigma(V, E, k)$ as $h(1)_1 \ldots h(1)_{|E|} \ldots h(|V|)_1 \ldots h(|V|)_{|E|}$; thus $\sigma(V, E, k)$ will have length
$|E| \cdot |V|$.

It is easy to see that $\|(G(V, E, k), \sigma(V, E, k))\|$ will be at worst polynomial in $\|E\|$, $\|V\|$,
and $k$ for reasonable $\|\cdot\|$. It will also be possible to construct the grammar and string in
polynomial time. Finally, note that given the definition of a possible instance of VERTEX
COVER, the grammar will be branching, $\varepsilon$-free, and duplicate-free, hence simple.
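The construction in rules (a) through (d), together with the input string, can be written out as a short program. This is a sketch under assumptions introduced here: vertices are strings, edges are pairs of vertices, the subscripted terminal $a_i$ is spelled `"a_i"`, and `build_instance` is a hypothetical name. It follows the text's definitions but is not code from the original.

```python
def build_instance(vertices, edges, k):
    """Return (nonterminals, terminals, rules, start, sigma) for (V, E, k)."""
    n_v, n_e = len(vertices), len(edges)

    def term(a, i):
        return f"{a}_{i}"                       # the subscripted terminal a_i

    h = [f"H{i}" for i in range(1, n_e + 1)]
    u = [f"U{i}" for i in range(1, n_v - k + 1)]
    d = [f"D{i}" for i in range(1, n_e * (k - 1) + 1)]
    nonterminals = {"START", "U", "D", *h, *u, *d}
    terminals = {term(a, i) for a in vertices for i in range(1, n_e + 1)}

    rules = [("START", tuple(h + u + d))]                        # rule (a)
    for i, (p, q) in enumerate(edges, start=1):                  # rules (b)
        rules += [(f"H{i}", (term(p, i),)), (f"H{i}", (term(q, i),))]
    rules += [(ui, ("U",)) for ui in u]                          # rules (c)
    rules += [("U", tuple(term(a, i) for i in range(1, n_e + 1)))
              for a in vertices]
    rules += [(di, ("D",)) for di in d]                          # rules (d)
    rules += [("D", (t,)) for t in sorted(terminals)]

    # sigma lists every vertex in enumeration order with subscripts 1..|E|
    sigma = [term(a, i) for a in vertices for i in range(1, n_e + 1)]
    return nonterminals, terminals, rules, "START", sigma


if __name__ == "__main__":
    triangle = [("a", "b"), ("b", "c"), ("a", "c")]   # hypothetical instance
    _, _, rules, _, sigma = build_instance(["a", "b", "c"], triangle, 2)
    print(len(rules), len(sigma))
```

The number of symbols and rules produced is polynomial in $|V|$, $|E|$, and $k$, which is the point of the size remark above; the exponential blow-up only appears if the single unordered START rule is multiplied out into ordered rules.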

Now suppose $(V, E, k) \in$ VERTEX COVER. Then there exist $V' \subseteq V$ and $f : E \to V'$ such
that $|V'| \le k$ and for every $e \in E$, $f(e)$ is an endpoint of $e$. $E$ is nonempty by hypothesis
and $V'$ must hit every edge, hence $|V'|$ cannot be zero. Construct a parse tree for $\sigma(V, E, k)$
according to $G(V, E, k)$ as follows.

Step 1. Number the elements of $V - V'$ as $\{x_i : i \in [1, |V - V'|]\}$. For each $x_i$ where
$i \le |V| - k$, construct a node dominating the substring $(x_i)_1 \ldots (x_i)_{|E|}$ of $\sigma(V, E, k)$ and
label it $U$. Then construct a node dominating only the $U$-node and label it $U_i$. Note that
the available symbols $U_i$ are numbered from 1 to $|V| - k$, so it is impossible to run out of
U-symbols. Also, $|V'| \le k$ and $V' \subseteq V$, hence $|V - V'| \ge |V| - k$, so all of the U-symbols
will be used. Finally, note that $U \to a_1 \ldots a_{|E|}$ is a rule for any $a \in V$ and that $U_i \to U$ is
a rule for any $U_i$.

Step 2. For each $e_i \in E$, construct a node dominating the (unique) occurrence of $f(e_i)_i$ in
$\sigma(V, E, k)$ and label it $H_i$. Step 2 cannot conflict with step 1 because $f(e_i) \in V'$, hence
$f(e_i) \notin V - V'$. Different parts of step 2 cannot conflict with each other because each one
affects a symbol with a different subscript. Also note that $f(e_i)$ is an endpoint of $e_i$ and
that $H_i \to a_i$ is a rule for any $e_i \in E$ and $a$ an endpoint of $e_i$.

Step 3. Number all occurrences of terminals in $\sigma(V, E, k)$ that were not attached in step 1
or step 2. For the $i$th such occurrence, construct a node dominating the occurrence and
label it $D$. Then construct another node dominating the $D$-node and label it $D_i$. Note
that the stock of D-symbols runs from 1 to $(k - 1) \cdot |E|$. Exactly $(|V| - k) \cdot |E|$ symbols
of $\sigma(V, E, k)$ were accounted for in step 1. Also, exactly $|E|$ symbols were accounted for in
step 2. The length of $\sigma(V, E, k)$ is $|V| \cdot |E|$, hence exactly

$|V| \cdot |E| - (|V| - k) \cdot |E| - |E| = |V| \cdot |E| - |V| \cdot |E| + k \cdot |E| - |E|$

$= (k - 1) \cdot |E|$

symbols remain at the beginning of step 3. $D \to a$ is a rule for any $a \in \Sigma$; $D_i \to D$ is a
rule for any $D_i$.
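As a quick check of this count, the instance of Figure 5 has $|V| = 5$, $|E| = 7$, and $k = 3$. Substituting these values (a worked instance added here for illustration) gives:

```latex
\begin{align*}
|V| \cdot |E| - (|V| - k) \cdot |E| - |E|
  &= 5 \cdot 7 - (5 - 3) \cdot 7 - 7 \\
  &= 35 - 14 - 7 \\
  &= 14 = (3 - 1) \cdot 7 = (k - 1) \cdot |E| .
\end{align*}
```

So fourteen terminals are left for the dummy symbols, which agrees with the range $[1, |E| \cdot (k - 1)]$ over which the $D_i$ are defined.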

Step 4. Finally, construct a node labeled START that dominates all of the $H_i$, $U_i$, and $D_i$
nodes constructed in steps 1, 2, and 3. The rule

START $\to H_1 \ldots H_{|E|} U_1 \ldots U_{|V|-k} D_1 \ldots D_{|E| \cdot (k-1)}$

is in the grammar. Note also that nodes labeled $H_1, \ldots, H_{|E|}$ were constructed in step 2,
nodes labeled $U_1, \ldots, U_{|V|-k}$ were constructed in step 1, and nodes labeled $D_1, \ldots, D_{|E| \cdot (k-1)}$
were constructed in step 3. Hence the application of the rule is in accord with the grammar.


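[Parse tree. START dominates the following nodes, left to right: H_1 H_2 D D D D D over
v_1 ... v_7; U_1 (through U) over w_1 ... w_7; D D H_3 D H_5 D D over x_1 ... x_7; U_2 (through U)
over y_1 ... y_7; D D D H_4 D H_6 H_7 over z_1 ... z_7. Each D here stands for a distinct D_i
dominating a D-node.]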


Figure 7: This parse tree shows how the grammar shown in Figure 6 can generate the string
$\sigma(V, E, k)$ constructed in Lemma 9.4 for the VERTEX COVER problem of Figure 5. The
corresponding VERTEX COVER solution $V' = \{v, x, z\}$ and its intersection with the edges
can be read off by noticing which terminals the H-symbols dominate.


Then $\sigma(V, E, k) \in L(G)$. (Figure 7 illustrates the application of this parse-tree construction
procedure to the grammar and input string derived from the VERTEX COVER example
in Figure 5.)

Conversely, suppose $\sigma(V, E, k) \in L(G)$. Then the derivation of $\sigma(V, E, k)$ from START
must begin with the application of the rule

START $\to H_1 \ldots H_{|E|} U_1 \ldots U_{|V|-k} D_1 \ldots D_{|E| \cdot (k-1)}$

and each $H_i$ must later be expanded as some subscripted terminal $g(H_i)$. Define $f(e_i)$ to
be $g(H_i)$ without the subscript; then by construction of the grammar, $f(e_i)$ is an endpoint
of $e_i$ for all $e_i \in E$. Define $V' = \{f(e_i) : e_i \in E\}$; then it is apparent that $V' \subseteq V$ and that
$V'$ contains at least one endpoint of $e_i$ for all $e_i \in E$. Also, each $U_i$ for $i \in [1, |V| - k]$
must be expanded as $U$, then as some substring $(a_i)_1 \ldots (a_i)_{|E|}$ of $\sigma(V, E, k)$.16 Since the
substrings dominated by the $H_i$ and $U_i$ must all be disjoint, and since there are only $|E|$
subscripted occurrences of any single symbol from $V$ in $\sigma(V, E, k)$, there must be $|V| - k$
distinct elements of $V$ that are not dominated in any of their subscripted versions by any
$H_i$. Then $|V - V'| \ge |V| - k$. Since in addition $V' \subseteq V$, $|V'| \le k$. Then $(V, E, k) \in$
VERTEX COVER. □
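Reading the cover off a parse amounts to stripping subscripts from the terminals that the H-symbols dominate. The sketch below assumes the parse is summarized as a hypothetical dictionary from each $H_i$ to its terminal $g(H_i)$, with $a_i$ spelled `"a_i"`; the sample assignment is the one that appears to be shown in Figure 7, as far as that figure can be read, and is consistent with the cover stated in its caption.

```python
def cover_from_h_expansions(h_expansions):
    """Compute V' = { f(e_i) } by dropping the subscript from each g(H_i)."""
    return {terminal.rsplit("_", 1)[0] for terminal in h_expansions.values()}


if __name__ == "__main__":
    # Assumed reading of Figure 7: which terminal each H-symbol dominates.
    g = {"H1": "v_1", "H2": "v_2", "H3": "x_3", "H4": "z_4",
         "H5": "x_5", "H6": "z_6", "H7": "z_7"}
    print(sorted(cover_from_h_expansions(g)))
```

Several H-symbols may map to the same vertex, so the resulting set can be smaller than $|E|$; the bound $|V'| \le k$ comes from the U-symbols, not from this step.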

Theorem 1: SIMPLE UCFG RECOGNITION is NP-complete.

Proof. SIMPLE UCFG RECOGNITION is in the class NP by Lemma 9.3, hence a
polynomial-time reduction of VERTEX COVER to SIMPLE UCFG RECOGNITION is
sufficient. Let $(V, E, k)$ be a possible instance of VERTEX COVER. Let $G$ be $G(V, E, k)$ and $\sigma$
be $\sigma(V, E, k)$ as constructed in Lemma 9.4. Note that $G$ is simple.

The construction of $G$ and $\sigma$ can, by lemma, be carried out in time at worst
polynomial in $\|E\|$, $\|V\|$, and $k$. Also by lemma, $(G, \sigma) \in$ SIMPLE UCFG RECOGNITION
iff $(V, E, k) \in$ VERTEX COVER. $k$ is not polynomial in $\|k\|$ under a reasonable encoding
scheme. However, $|E| \ge k$, hence $\|E\| \ge \|k\|$; also $\|(V, E, k)\| \ge \|E\|$, hence $\|(V, E, k)\| \ge k$,
all by properties assumed to hold of $\|\cdot\|$. Then $G$ and $\sigma$ can in fact be constructed in time
at worst polynomial in $\|(V, E, k)\|$.

Hence the VERTEX COVER problem is polynomial-time reduced to SIMPLE UCFG
RECOGNITION. □


16 The grammar would allow the substring $(a_i)_1 \ldots (a_i)_{|E|}$ to appear in any permutation, but in $\sigma(V, E, k)$
it appears only in the indicated order.